# Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model

# 🚨 Visualizations

We provide generated examples of Ctrl-Adapter here: https://ctrladapterexamples.github.io/ 


# 🔧 Setup

### Environment Setup

If you only need to run inference, install from ```requirements_inference.txt```. To keep the codebase easy to use, the primary libraries you need are Torch, Diffusers, and Transformers. No specific versions of these libraries are required; the default versions should work fine :)

If you plan to train, install from ```requirements_train.txt``` instead, which includes the additional dependencies needed for training.


```shell
conda create -n ctrl-adapter python=3.10
conda activate ctrl-adapter
# Choose one of the following:
pip install -r requirements_inference.txt  # inference only
pip install -r requirements_train.txt      # training
```
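
After installing, a quick sanity check can confirm that the three primary libraries are present. The snippet below is a minimal sketch and not part of the released codebase; the helper name `report_versions` is hypothetical, and it assumes the PyPI distribution names `torch`, `diffusers`, and `transformers`.

```python
from importlib.metadata import PackageNotFoundError, version


def report_versions(packages=("torch", "diffusers", "transformers")):
    """Return {package: installed version, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in report_versions().items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

If any entry prints `NOT INSTALLED`, re-run the matching `pip install -r ...` command inside the activated `ctrl-adapter` environment.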



# 🔮 Inference

All inference scripts are located under ```./inference_scripts```.

Here is a sample command that runs inference on SDXL with a depth map as the control (with extracted condition):

```shell
sh inference_scripts/sdxl/sdxl_inference_depth.sh
```

⚠️  ```--control_guidance_end```: this is the most important parameter, as it balances generated image/video quality against control strength. If the generated image/video does not follow the spatial control well, increase this value; if the generated image/video quality suffers because the spatial control is too strong, decrease this value. Our paper discusses how this parameter affects control strength in detail.
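
To build intuition for this parameter, the sketch below shows how a guidance window can be turned into a per-step on/off mask. It assumes that ```--control_guidance_end``` follows the same convention as the `control_guidance_start` / `control_guidance_end` arguments of the Diffusers ControlNet pipelines, i.e. fractions of the total number of denoising steps. The function name `control_keep_mask` is hypothetical, and this is an illustration rather than the repository's actual implementation.

```python
def control_keep_mask(num_steps, guidance_start=0.0, guidance_end=1.0):
    """Return 1.0 for denoising steps where control is applied, else 0.0.

    Assumption: the window is expressed as fractions of the total steps,
    mirroring the convention used by the Diffusers ControlNet pipelines.
    """
    mask = []
    for i in range(num_steps):
        before_window = i / num_steps < guidance_start
        after_window = (i + 1) / num_steps > guidance_end
        mask.append(0.0 if (before_window or after_window) else 1.0)
    return mask


if __name__ == "__main__":
    for end in (0.5, 1.0):
        mask = control_keep_mask(num_steps=10, guidance_end=end)
        print(f"control_guidance_end={end}: control active on {int(sum(mask))}/10 steps")
```

Under this assumption, a larger value keeps the spatial control active for more of the denoising steps (stronger control), while a smaller value leaves the later steps unconstrained so the backbone can refine image/video quality.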


# 🚅 How To Train 

🎉 To make our method reproducible and adaptable to new backbones, we have released all of our training code :) 

You can find a detailed training guideline for Ctrl-Adapter [here](assets/train_guideline.md)!

